1. The need for good criterion measures.

Many of the questions in the use of observation instruments are
answerable if one can test the functional relationships between events
recorded through these instruments and student outcomes. Because
acceptable measures do not exist for many important educational outcomes,
the testing of functional relationships is limited by the current adequacy
of the outcome measures. Thus, there is a need for research, development,
and reviews on outcome measures.

2. Can observational instruments be used as outcome measures?

Yes, but they need to be used in situations that are relatively independent
of the classroom; otherwise they would be measuring process and not outcomes.
Observational instruments have been used to assess criterion behavior in
correlational and experimental studies. For example, the Russell Sage
Social Relations Test employs coding of pupil interaction as a group of
students work on construction-block projects. Experimental research in
early childhood has employed observational data of the child in his
environment as both baseline and posttest data while the treatment took
place in a special setting.

But using process measures taken in class as outcome measures for
variables such as independence, curiosity, or cooperation may lead to
unwarranted conclusions. Currently, although there have been a number of
studies relating process measures to student gain in reading, the
correlations have not been particularly high despite the fact that
investigators chose variables which they expected would be strongly related
to reading gain. If, as yet, we are unable to establish strong relationships
in process-outcome studies, then it does not seem legitimate to claim that
certain processes, which have not yet been related to outcomes, are
important in their own right.

There are a few process variables which are important in their own right.
Teacher behaviors which demean or humiliate pupils are currently
undesirable no matter what relationship is demonstrated between process and
outcome. (Even in such a case one would want to know if the pupils felt
humiliated; what do we do if the observer believed that the behaviors
humiliated pupils but the pupils did not feel the same way?) There are
other teacher behaviors which people appear to advocate on grounds of taste.
Arguments about the extent and type of individualization, the choice and
method of studying various subject areas, and the necessary amount of joy
in a classroom appear based on grounds of taste rather than moral grounds
or research grounds. I assert that just as school dress codes cannot be
justified by taste alone, teachers cannot be held accountable for specific
classroom transactions solely on grounds of taste.

In the introduction to his book on testing, Ebel presents another
illustration of the process-outcome measurement issue. He writes that judges
watching children at play could make estimates of the relative abilities
of the children to run fast, jump high, or throw an object far. He argues
that everyone concerned would probably prefer to see these estimates made
under the standardized and controlled, if somewhat artificial, conditions
of a regular track meet.

3. There are many forms of observation.

It would be a mistake to limit observational instruments to category
instruments. At present, rating instruments, teacher self-reports, sign
instruments, and student questionnaires are all viable observational
instruments. At present we do not know whether one form is more functional
than another, and this is probably a poor question. That is, some forms
may be more functional for some constructs (e.g. teacher positive responses)
and other forms more functional for constructs such as type of question or
interaction.

4. One does not validate an observational instrument.

Even in research which looks at functional relationships, one can
only begin to validate items on an observational instrument, not the complete
instrument.

5. The coding of content covered.

The development of procedures to code the content which is covered in
a classroom is a research need of the highest priority. At present, there
are only three or four observation instruments which include codes on
content. In almost all current observational instruments, teacher divergent
questions on how to arrange a classroom, for example, receive identical
coding as questions on the application of a principle to new situations.

The RAMOS instrument, developed by R. Calfee and K. Hoover, is one
example of a new instrument with a content dimension. The reading dimension
contains sixteen options such as simple decoding, syllabification,
recreational reading, comprehension of relations, and comprehension of
sequence. This dimension could be used with any category or sign instrument
so that the context of the behavior could be coded with the behavior.
Some hypotheses of critical interest would be:

a) the correlations between teacher behaviors (or student behaviors)
and the outcome measure(s) will be strengthened if the behaviors and the
content area are coded together;

b) frequency counts of content behaviors alone will yield a substantial
correlation with pupil outcome measure(s).



6. The danger of The Great Complexifiers.

In research on functional relationships it is easy to pose so many
questions and issues that a researcher and a research enterprise can become
immobilized. The Great Complexifiers are those who pose these additional
questions, much as professors at a preliminary oral keep asking "Have you
controlled for....".

Suppose one wished to mount a series of studies to look at teacher
questions and student achievement in grades K-3. One could set up a matrix
in which one factor consisted of the four grade levels, and a second factor
was on the subject areas: reading, math, science, social science. Thus one
begins with a fairly complex sixteen cell matrix.

The great complexifiers respond that the number of factors and cells
is too few. They suggest a location factor, suggesting that the schools
be divided into urban, suburban, and rural. They suggest an income factor,
dividing pupils into low income and middle income. They suggest separate
cells for male and female pupils; they suggest that race and ethnic background
be considered, so that pupils are classified as black, Mexican, Puerto Rican,
American Indian, Appalachian, and white. The complexifiers further suggest
that tests are not unidimensional, and therefore outcomes should at least
be classified as recall and processing outcomes. Finally, another complexifier
will claim that there is no such thing as "first grade reading," but rather,
there is MacMillan reading, Sullivan reading, Bank Street reading, and five
other kinds of reading curriculum.

So to a sixteen cell matrix one adds three levels of school location,
two income levels, two sex levels, six ethnic levels, two outcome levels,
and eight curriculum levels for a 4 X 4 X 3 X 2 X 2 X 6 X 2 X 8 matrix
of roughly 18,000 cells (18,432 exactly)!
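
The arithmetic behind that cell count can be checked with a minimal sketch.
The factor labels below are illustrative names chosen here, not terms from
any instrument; only the number of levels per factor comes from the text.

```python
from math import prod

# Levels per factor as listed in the text; the keys are illustrative labels.
factors = {
    "grade": 4,
    "subject": 4,
    "location": 3,
    "income": 2,
    "sex": 2,
    "ethnic_background": 6,
    "outcome_type": 2,
    "curriculum": 8,
}


def cell_count(levels):
    """Number of cells in a fully crossed design: the product of all levels."""
    return prod(levels.values())


base_cells = factors["grade"] * factors["subject"]  # the starting matrix
total_cells = cell_count(factors)                   # after every added factor

print(base_cells, total_cells)
```

Each added factor multiplies the design rather than adding to it, which is
why six modest-looking suggestions push the count past the number of
classrooms any one study could observe.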

Complexification and "did you control for" is just plausible enough
to serve to immobilize a researcher or a research program because many of
the variables of possible interest are not being studied (or can't be studied
given the number of cells compared to the possible number of classes).
[...] which of all possible cells appear more promising than others.
[...] the absence of accepted criterion measures will hamper such a venture,
but I recommend that an effort be begun to sort out which pieces of this
complexity seem worthwhile for future research.

7. Can observation be used for teacher competency assessment or for pupil
competency assessment?

The use of observation for generic teacher competency assessment is
unfeasible at this time because we know so little of functional relationships
between behaviors and outcomes. A similar argument appears to apply for
pupil assessment.

But assessment is possible within the context of curriculum or program
implementation. In this case the criteria for assessment variables are one
or more steps removed from outcome measures; in this case the criteria are
defined by the developer and represent those actions considered important
to implement the program according to the intentions of the developer(s).

Assessment of curriculum implementation, at this point, is not assessment
of teacher competence or even of program competence. Rather, it is a necessary
first step in planning research. Without subsequent research, implementation
assessment is not particularly meaningful because implementation variables
are only hypotheses that these variables are important for the outcomes.

8. The importance of naturalistic observation.

Naturalistic observation can serve both as a source and as a supplement
for categorical observation. Naturalistic observation can serve to suggest
potentially functional variables which have been overlooked when
developing a category instrument. [...] that an observer or researcher is
easily persuaded that the variables which strike him as important are
indeed functional.

9. Developing "clean" observational concepts.

There are many factors which militate against developing clean
observational concepts. The first is that there is too much noise to permit
clear translation of concepts developed outside a classroom into an
observational instrument. For example, consider a construct such as an
"analysis question" taken from the Bloom et al. Taxonomy, or a "divergent
question" taken from Guilford's research. These constructs were
operationalized in written questions. In all probability these constructs do
not fit neatly when coding actual questions in the classroom because there
is too much noise.



[...] response to the first question, or attempts to probe an initial
answer. Yet the concepts developed by Bloom or Guilford do not fit a
"probing" situation because their concepts were not developed for interactive
settings. At present, despite the relative purity of the origins of the
typologies developed by Bloom or by Guilford, we have a great deal of trouble
translating ideas developed from one source into the classroom. Another
problem with Bloom's or Guilford's typology of questions is that there are
so many existing typologies. In addition to these two, questions have been
classified into six or more types by B.O. Smith, by Taba, by Brophy, and
by Gallagher. We don't know how these different typologies converge, how
they differ, and which categories of questions are functional.



10. Coding questions and cognitive interactions.

The following suggestions for research studies may further illustrate
the difficulty of obtaining clean observational concepts.

Suppose one wished to develop a series of studies on the functional
value of different ways of coding questions (or cognitive interchanges).
As noted above, questions have been coded into six or more different types
by a number of investigators such as Smith, Brophy, Bellack, Bloom, Taba,
Gallagher, and Conners and Eisenberg. At present we do not know whether
these concepts are similar or different, nor do we know the functional value
of these concepts.

One way of developing research studies in this area would be to take
three or four sets of specimen tapes and code them using the different
ways of coding questions. Bob Soar alone may have enough sets of specimen
tapes for one such study. Soar has audiotapes and a number of outcome
measures for over 100 K-3 classrooms in Follow Through for at least two
years.

Assuming that Soar's audiotapes represent specimen sets, one could
code the tapes using each of the above seven categorizations and relate
the obtained frequencies to measures of student achievement. The results
would not validate a particular coding procedure, but they might tell us
which specific items were more functional than others in this context.
Studies of the intercorrelations among question types within coding schemes
and across coding schemes could indicate how the question types cluster into
independent groupings. The results obtained on one set of tapes could be
cross validated against another set.

Whether this approach would yield conceptual clarity and stable functional
relationships is testable. An alternative hypothesis would be that there
are so many ways of developing a coding scheme based on each of the above
categories of question types that the number of studies which could be run
(and the number of valid and spurious correlations which could be obtained)
makes this approach unmanageable.
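
A minimal sketch of the analysis step described above: each classroom
contributes a frequency count per question type under one coding scheme, and
each count is correlated with a class-level achievement measure. The question
type labels, counts, and achievement scores below are invented for
illustration and are not data from Soar or any other study; the Pearson
coefficient is an assumed choice of statistic, since the text does not name
one.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def relate_to_outcome(frequencies, outcome):
    """Correlate each coded item's per-classroom frequency with the outcome."""
    return {item: pearson(counts, outcome) for item, counts in frequencies.items()}


# Hypothetical counts for five classrooms under one hypothetical coding scheme.
frequencies = {
    "recall_question": [12, 9, 15, 7, 11],
    "analysis_question": [3, 5, 2, 6, 4],
}
achievement = [48, 52, 45, 55, 50]  # hypothetical class mean scores

for item, r in relate_to_outcome(frequencies, achievement).items():
    print(item, round(r, 2))
```

The same `pearson` function applied to pairs of question types, rather than
to a question type and the outcome, would give the intercorrelations within
and across coding schemes. Repeating the run on a second set of tapes is the
cross validation the text calls for.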

Within any one set of question codes one still has questions on

     coding single events or coding sequences

     the unit of analysis (e.g. frequency, move, utterance,
     cycle, topic, etc.)

     the number of different questions which fit into one
     question type

     the number of dimensions (e.g. speaker, tone, content) to
     include within a count

     the scale to be used to estimate frequency (e.g. category or
     sign method)

Whether different procedures for using the same concepts of questions will
yield different, consistent, and functional results is a research question.
Although I would guess that the results will be uninterpretable, I would
recommend that this series of studies be run in the hopes of determining
whether there are empirical procedures which might yield conceptual clarity.

The issues listed above in the technology of coding questions are
independent of the theoretical origins of a set of categories of question
types. Even if one decides to take the variables and their names from the
theory and research of Dewey, Piaget, Miller, or Skinner, one still has to
make decisions on the unit of analysis, the number or type of coding
dimensions, and other issues. There are no guidelines for these decisions no
matter how clean the theoretical origin of the observational instrument.



10a. A typology of questions and cognitive interactions.

Much as the idea of a typology of questions has appeal, the development
of such a typology is difficult to conceive because of the variety of ways
which exist to code questions (e.g. the codings developed by such as Smith,
Bloom, Taba, and others) and the variety of recording procedures which might
be used. The use of a data bank of interactions and outcomes to test which
question types and which recording procedures are most useful seems appealing,
but I worry that the results of such a series of studies will not yield
clean results. My worry, however, is testable.

10b. Determining functional units, approaches, and recording procedures.

If one returns to the example of recording questions or cognitive
interactions, there remain a number of unresearched research issues.
When developing an observational instrument, one must make decisions on:

     the number of different behaviors to be included in a
     variable (e.g. are all instances of praise to be
     considered as one variable, or will subdivisions be
     made for different apparent forms of praise; similarly for
     criticism, feedback, types of questions)

     the number of dimensions to be encoded with each behavior
     (such dimensions could include the content, the source,
     the number of students attending, the firmness of the
     interaction, the role the teacher was in, additional
     cognitive and affective dimensions to an interaction)

     the unit of recording (e.g. natural unit, simple count,
     sign count, rating)

     how many sequences should be recorded (e.g. single instances,
     dyads, triads)

     whether smaller variables should be combined for analyses

     whether ratios of behaviors should be used for analyses

The above list seems awesome and similar to the great complexifiers'
arguments. In this case, if a variable did not correlate with student
outcomes, one could argue that the variable would have been significant
if only the size of the variable or the number of dimensions or the unit
of recording or something else had been different.

The above argument seems as unresolvable as the argument of the great
complexifiers. The suggested additional procedures for encoding observations
seem as plausible and researchable as the additional contexts suggested
by the complexifiers.

The issue is further complicated by the also plausible idea that one
type of unit of analysis (for example) may be most functional for
one variable and another type of unit for another variable.

Some research seems called for to determine the functional value of
some of the above questions. But I don't believe that one can tackle or
expect to tackle all these questions. The best one could hope for would
be to focus on those issues in the above list which most people consider
relevant, and such a needs assessment could be done by sending a checklist
to a panel of experts.

11. Indexing implementation.

One could make a case that the indexing of program implementation is
a fairly straightforward matter. One takes the behaviors considered important
by the developer, develops an observation instrument, and uses the instrument
to develop an index of implementation.



In practice, studying implementation has meant different things. Soar,
for example, did not go to the program developers for lists of critical
behaviors. Instead he chose observation instruments which he believed
reflected the differences across eight Follow Through programs and used
these results as an index of implementation. Soar first factor analyzed his
results and then determined the range of classrooms within each program on
the relevant factors. When the within program range was smaller than the
across program range this was taken as indicating successful implementation.
Using the Newman-Keuls procedure, Soar found a number of relevant dimensions
on which programs differed and such differences usually reflected the
a priori orientations of the programs. Thus, implementation for Soar meant
differentiation.

Stallings also used a differentiation procedure to index implementation;
however, her observational instrument was constructed differently. Stallings
first observed Follow Through classrooms and used her notes on the different
models to construct her instrument. Following this, she asked each sponsor
to select those variables considered most important for implementing their
program and to further [...] programs or control classrooms.
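
The range comparison described above can be sketched as follows. This is an
illustration under stated assumptions, not Soar's actual procedure: the
across-program range is taken here as the range of one factor score over all
classrooms pooled, the program names and scores are invented, and the factor
analysis and Newman-Keuls steps are omitted.

```python
def value_range(values):
    """Range of a list of scores: largest minus smallest."""
    return max(values) - min(values)


def differentiation_index(scores_by_program):
    """For each program, compare the within-program range of classroom
    factor scores with the range across all classrooms pooled.

    Returns {program: (within_range, across_range, differentiated)}, where
    `differentiated` is True when the within range is the smaller one.
    """
    pooled = [s for scores in scores_by_program.values() for s in scores]
    across = value_range(pooled)
    return {
        program: (value_range(scores), across, value_range(scores) < across)
        for program, scores in scores_by_program.items()
    }


# Hypothetical factor scores for classrooms in three hypothetical programs.
scores = {
    "program_a": [1, 2, 3],
    "program_b": [7, 8, 9],
    "program_c": [1, 5, 9],  # spans the full pooled range
}

for program, (within, across, ok) in differentiation_index(scores).items():
    print(program, within, across, ok)
```

Under this reading a program counts as implemented on a factor when its
classrooms cluster more tightly than classrooms in general, which is why the
index measures differentiation among programs rather than fidelity to a
developer's list of behaviors.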

Siegel developed a set of implementation variables only for the
DISTAR (or Oregon) Follow Through Model. Illustrative variables were:

     teacher follows the program format when working with
     the entire group

     correction procedure for mistakes when pupil does not
     understand teacher's signal

     repeating task from beginning when pupil does not
     understand teacher's signal

     ratio of attempts to obtain a unison response to the
     number of non-unison responses

Variables such as the above could be used to observe any program, but the
behaviors are most likely to occur in the DISTAR program or in a similar
structured, interactive program such as the Southwest Lab Communication
Skills program.

These three investigators, all of whom were interested in indexing
implementation, have used three different procedures to do so. The variables
each has selected differ from the others both in the range of events
covered and the level of specificity. (Whether a greater range or a more
detailed level of specificity is functional is an empirical question.)
One might expect that other investigators would come up with still other
procedures for indexing implementation. So how does one proceed?



12. What problems should be studied and when?

The above question, raised by Joy during the meeting, seems very
important. I would recommend that lists of research problems be developed
and that a panel, such as the one which met, attempt to see if they can
assign priorities to the research problems.

Some research questions which I offer are:

a) in what settings should the research take place? (naturalistic,
standard situation, specific curriculum product)

b) what types of variables should be selected? (those which focus
on curriculum-emphasized activities, those which include general
instructional variables)

c) what type of recording scale is most functional for what type
of item? (frequency count by time, frequency count by natural unit,
sign count, rating)

d) what variables are worth studying?

e) what contextual variables (e.g. pupil parent income, school
location, curriculum, boy/girl ratio) are most important?

I am not sure how one would go about making decisions about priorities
for studying the above issues. I recommend, for starters in a discussion,
that (c), the recording scale, and (d), variable selection, be given top
priority because solution of these issues is necessary for work on the
next issues.



13. The need for a list of research issues.

The above issues and problems are certainly not exhaustive. I recommend
that a list of issues and problems be developed and that a panel work on
(a) defining the issues, (b) placing research priorities on these issues,
and (c) suggesting research strategies.

14. A proposal for a data bank for secondary analyses.

If a data bank were available then many of the issues we raise could
be subjected to empirical study. At the minimum, a data bank would include
information on classroom transactions and on student outcomes. Transaction
data could be on videotape or audiotape as well as in pupil questionnaires,
observer ratings, and category counts. Bob Soar's material is an example
of one type of material which could be included in such a bank.

The data bank need not be stored in any single place, but it is important
that the items within a bank be accessible to researchers.



15. The importance of instructional research within curriculum programs.

Whether observational research within curriculum programs will yield
better results than observational research which ignores program distinctions
is a testable hypothesis. The argument is made here that some curriculum
products provide teachers with tools which they would not receive if they
worked without the program. To an unknown extent, these tools facilitate
student [...]. It is hypothesized that instructional research which aimed at
improving the impact of selected curriculum programs will be more effective
than research which attempts to improve the general impact of teachers
across programs. This is not to say that general instructional variables
should not be studied (even within the context of specific programs),
but, rather, that the payoff would be greatest when research takes place
within specific programs.

A major reason for the above argument has been the research in Planned
Variation Follow Through. At the summative level, certain types of programs
have been consistently more successful in engineering pupil achievement
[...] suggest that these [...] programs have been extremely successful tools
for the teachers [...] suggested that observational work designed to improve
these tools would be a wise investment.